Project Outline

This report is Part 2 of a five-part series exploring and analyzing ocean buoy data collected from National Data Buoy Center (NDBC) stations maintained by NOAA. In Part 1 we explored ocean current observations at NDBC Station 46087 (Neah Bay Buoy) and compared them with ocean current forecasts from a third party. Here in Part 2 we look at meteorological (wind and wave) data from the Neah Bay Buoy and examine the potential for significant meteorological events to introduce noise in ocean current observations. In Part 3 we will introduce meteorological data for another location, NDBC Station 46088 (New Dungeness Buoy), and compare trends in wave height, period, and direction with those of the Neah Bay Buoy. We will attempt to highlight the relationship between swell events at the Neah Bay Buoy and swell events at the New Dungeness Buoy. In Part 4 we will walk through the considerations and processes involved in training and testing a supervised ML model to predict the class of wave which might occur at the New Dungeness Buoy given conditions at the Neah Bay Buoy. In Part 5 we will put our final classifier model into production by supplying forecasted conditions for the Neah Bay station and determining the predicted class of wave observed at the New Dungeness station.

More detailed information regarding the NDBC, and the locations of buoys they maintain, can be found on their website.

Executive Summary: Part 2

In this report, we introduce meteorological data collected from NDBC Station 46087 (Neah Bay Buoy) at the West entrance to the Strait of Juan de Fuca. We explore summary statistics for mean aggregate values, noting seasonal trends in observations.

Then we introduce time series data visualization and gain further insight into seasonal trends for wind and wave data over years 2014 to 2019. We take a closer look at monthly observations for year 2016, where individual spikes in wave height, or ‘swell events’, become apparent.

Finally, we wrap up the data exploration by comparing instances of ‘erratic’ and ‘dampened’ ocean current observations, similar to those identified in Part 1 of this project, with meteorological data along the same timeline. We explore the elevated water temperatures during 2016, and their potential relation to increased marine growth. We conclude that the ocean current predictions are a fair and valid representation of how we could expect ocean current observations to behave in the absence of strong meteorological events (‘wind events’, ‘wave events’, etc.).

Data

The data we will be exploring is available for download in nicely formatted, yearly ‘.txt’ files from the NDBC website here. There are a significant number of missing observations, with the range of available data spanning from 2004 through 2019. After compiling each year into a single dataset and performing routine data cleaning steps, I chose to engineer several new features. A formal definition and description of each feature is available in the appendix of this report, and details regarding measurement techniques utilized by the NDBC can be found here.
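
As a rough illustration of one routine cleaning step, the Python sketch below replaces placeholder readings with None before any averaging takes place. It assumes that missing readings in the yearly files are written as runs of 9s (for example 99.0, 999, or 9999.0). The field names follow the appendix, but the SENTINELS table and the clean_row helper are hypothetical and are not the code used for this report.

```python
# Hypothetical missing-value markers per field (an assumption about the raw files).
SENTINELS = {
    "WVHT": {99.0}, "DPD": {99.0}, "APD": {99.0},
    "MWD": {999.0}, "WDIR": {999.0},
    "WSPD": {99.0}, "GST": {99.0},
    "PRES": {9999.0},
    "ATMP": {999.0}, "WTMP": {999.0}, "DEWP": {999.0},
}


def clean_row(row):
    """Return a copy of `row` with sentinel readings replaced by None."""
    cleaned = {}
    for field, value in row.items():
        is_sentinel = (
            field in SENTINELS
            and value is not None
            and float(value) in SENTINELS[field]
        )
        cleaned[field] = None if is_sentinel else value
    return cleaned


# One made-up raw record: wave height and wave direction are missing.
raw = {"WVHT": 99.0, "MWD": 999, "WSPD": 2.4, "PRES": 1015.8, "WTMP": 11.4}
print(clean_row(raw))
```

Leaving placeholder values in place would distort every mean computed later, which is why this kind of step usually comes before aggregation.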

Exploring Meteorological and Swell Data

Before we dive deep into the granularity of the data, I believe it is important to obtain a better understanding of what data is present. We can do this by aggregating data for all months to obtain monthly averages:

Summary Statistics for all Months, NDBC Station 46087

Month  Number of     Mean Wave  Mean Wave   Mean APD  Mean DPD  Mean Wind  Mean WSPD  Mean PRES  Mean ATMP  Mean WTMP
       Observations  Dir (deg)  Height (m)  (s)       (s)       Dir (deg)  (m/s)      (hPa)      (°C)       (°C)
1      17879         248.71     2.41        7.74      12.08     143.46     7.70       1017.37    6.83       8.34
2      16743         254.41     2.16        7.66      11.87     145.15     6.63       987.68     6.63       8.17
3      17354         258.87     2.02        7.54      11.41     164.79     5.92       1015.75    7.47       8.75
4      16846         265.48     1.97        7.55      11.25     187.24     5.23       1017.27    8.72       9.66
5      17933         267.55     1.53        6.89      9.81      208.42     3.85       1017.15    10.54      10.68
6      16917         269.30     1.39        6.65      9.12      215.32     3.23       1017.47    11.73      11.39
7      19932         271.54     1.27        6.46      8.82      218.95     2.91       1017.96    12.53      11.86
8      20044         275.57     1.25        6.49      8.53      210.02     2.79       1016.60    12.68      11.95
9      18949         271.16     1.50        7.11      9.72      172.20     3.43       1016.60    12.30      11.73
10     21140         262.29     1.94        7.63      11.00     146.57     5.33       1016.13    10.91      11.31
11     19899         256.23     2.31        7.47      11.15     151.80     7.05       1015.19    8.99       10.61
12     19187         258.39     2.51        7.72      12.02     144.63     7.25       1016.26    6.94       9.09

We can see that the number of observations is fairly consistent from month to month, ranging from 16,743 in February to 21,140 in October. There is a lot of information here, and we will use data visualization techniques to explore it further. For now, consider this aggregation of data by year:

Summary Statistics for All Years, NDBC Station 46087

Year   Number of     Mean Wave  Mean Wave   Mean APD  Mean DPD  Mean Wind  Mean WSPD  Mean PRES  Mean ATMP  Mean WTMP
       Observations  Dir (deg)  Height (m)  (s)       (s)       Dir (deg)  (m/s)      (hPa)      (°C)       (°C)
2004   4218          266.35     1.89        7.22      10.42     171.05     4.44       1016.30    11.13      11.37
2005   16072         270.09     1.89        7.19      10.75     173.41     4.71       1015.66    10.07      10.72
2006   13157         267.43     1.93        7.04      10.24     190.36     4.75       980.79     10.20      10.72
2007   17072         262.73     2.03        7.38      10.66     177.64     5.21       1016.88    9.29       9.92
2008   17029         261.82     2.14        7.60      11.17     184.19     5.04       1016.93    8.51       9.31
2009   15434         268.64     1.74        7.09      10.26     191.26     4.95       1016.83    9.36       9.84
2011   12363         270.41     1.74        7.16      10.35     188.45     4.23       1017.68    10.65      10.29
2012   12853         264.33     1.89        7.20      10.38     177.13     5.29       1015.15    9.04       9.68
2013   10249         263.40     1.89        7.41      11.29     164.05     5.67       1020.69    7.57       8.68
2014   17405         261.32     1.80        7.09      10.29     176.30     5.59       1016.42    10.07      10.70
2015   17463         264.50     1.85        7.28      10.44     172.09     5.00       1016.68    10.68      11.20
2016   17451         260.11     2.02        7.46      10.74     169.52     5.36       1015.63    10.64      11.23
2017   17332         258.14     1.76        7.04      10.10     168.77     5.43       1015.88    9.60       10.41
2018   17420         264.24     1.79        7.18      10.32     175.46     5.16       1017.07    9.97       10.58
2019   17305         261.97     1.69        7.25      10.99     159.91     4.93       1016.22    10.04      10.64

Year 2010 is missing altogether. Also, year 2004 has only about four thousand observations and year 2013 about ten thousand, compared with about seventeen thousand in the fullest years. As we continue with our exploratory data analysis, it will be important to keep in mind that aggregations built from fewer observations give a less reliable picture of what actually happened during that period. In other words, be wary of drawing conclusions from comparisons between aggregations with very different numbers, or distributions, of observations.
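
The monthly and yearly aggregations shown in the tables can be sketched in Python as a simple group-and-average, keeping the observation count next to each mean so that thin groups stay visible. This is a minimal illustration on made-up rows, not the code behind the tables. The aggregate helper is hypothetical, and it assumes that missing readings are excluded from both the count and the mean, which the report does not state.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# A few made-up rows standing in for the compiled dataset: (timestamp, WVHT in meters).
observations = [
    (datetime(2016, 1, 5, 0, 20), 2.6),
    (datetime(2016, 1, 5, 0, 50), None),   # missing reading, skipped below
    (datetime(2016, 1, 6, 3, 20), 2.2),
    (datetime(2016, 7, 9, 12, 20), 1.2),
    (datetime(2017, 7, 9, 12, 50), 1.4),
]


def aggregate(rows, key):
    """Group rows by key(timestamp) and return {group: (count, mean)}."""
    groups = defaultdict(list)
    for stamp, value in rows:
        if value is not None:
            groups[key(stamp)].append(value)
    return {g: (len(v), round(mean(v), 2)) for g, v in sorted(groups.items())}


by_month = aggregate(observations, key=lambda t: t.month)
by_year = aggregate(observations, key=lambda t: t.year)
print(by_month)
print(by_year)
```

Reporting the count alongside the mean makes it easy to spot groups, such as 2004 here, where an average rests on comparatively little data.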

In light of this, let’s continue with a look at Monthly Aggregate data for Wave Heights:

The months along the y-axis have been arranged in ascending order of average wave height, and values for average wave height appear along the x-axis. The size of the point corresponds to the average dominant period, while the color corresponds to the average direction for each month. Notice the distinct grouping of winter months between October and April, which tend to have larger average wave heights, larger dominant periods, and a more Southerly direction. Also notice the distinct group of summer months between May and September, which tend to have smaller average wave heights, smaller dominant periods, and a more Northerly direction. These seasonal swell patterns are common knowledge, and their presence adds validity to our data.

Consider the following display of the Yearly Aggregated Wave Height data:

The years along the y-axis have been arranged sequentially from 2004 to 2019, and values for average wave height appear along the x-axis. In this plot, the size of the point corresponds to the number of observations used to generate the aggregate means, while the color represents the average wave direction.

Now let’s take a look at Monthly Aggregated Wind and Weather Data:

Here we see months arranged in ascending order of average wind speed along the y-axis and values for average wind speed along the x-axis. The size of the point corresponds to the atmospheric pressure, while the color of the point corresponds to the average wind direction for each month. Remember, it’s valid to draw conclusions from comparisons in this chart since each month has a roughly equal number of observations. Again we notice two distinct groups of winter and summer months. Winter months tend to have lower atmospheric pressure and stronger winds averaging from a more South Easterly direction, while summer months tend to have higher atmospheric pressure and lighter winds averaging from a more South Westerly direction.

Finally let’s look at Monthly Aggregated Air and Water Temperatures:

Here we see months arranged in ascending order of average water temperature along the y-axis, with values for average water temperature appearing on the x-axis. There appear to be three major groupings of months, with December through April on the lower end, November and May in the middle, and June through October showing the warmest average water temperatures. Also, there appears to be a strong positive correlation between mean water temperature and mean air temperature.
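
To put a number on the visual impression of correlation, the Python sketch below computes a Pearson correlation coefficient from the twelve monthly means in the monthly summary table. It is only a rough check: the pearson_r helper is written out by hand for illustration, and twelve aggregated points say nothing about how individual observations co-vary.

```python
from math import sqrt

# Monthly means (January to December) copied from the monthly summary table.
mean_atmp = [6.83, 6.63, 7.47, 8.72, 10.54, 11.73, 12.53, 12.68, 12.30, 10.91, 8.99, 6.94]
mean_wtmp = [8.34, 8.17, 8.75, 9.66, 10.68, 11.39, 11.86, 11.95, 11.73, 11.31, 10.61, 9.09]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


r = pearson_r(mean_atmp, mean_wtmp)
print(f"Pearson r between monthly mean ATMP and WTMP: {r:.2f}")
```

A value close to 1 is consistent with the strong positive relationship seen in the plot.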

Time Series Exploration

Let’s take a different approach and explore this data in a series of yearly plots. Consider these visualizations of wave data for 2016 through 2019:

The height of a recorded wave is indicated on the y-axis, with the color of the point representing the direction the wave is coming from. There is some data missing for part of the spring and early summer of 2017. In general we see two primary colors, shades of green indicating a more Southwesterly direction and shades of blue indicating a more Northwesterly direction. Consistent with our previous data aggregations, we see larger swell typically occurring during the winter months.

Let’s look at the Wind and Pressure data for the same date range:

Here the y-axis shows wind speeds in m/s. The color of the point corresponds to the direction the wind is coming from, and the size corresponds to the pressure at the time of the observation. The wind directions aren’t as cleanly organized as the wave direction data, but we do see trends in winter versus summer months.

Diving Deeper with Year 2016

Now let’s focus our attention on monthly observations of wave data for year 2016:

Each peak on these plots corresponds with a ‘swell event’. It is interesting to note how cleanly organized some of the swell events are, in comparison to the disorganized appearance of others. I wonder how they relate to concurrent wind data.

Let’s find out:

Comparing Meteorological and Ocean Current Data

A main objective of this report is to assess meteorological data during known discrepancies between observations and predictions of ocean currents. For the remainder of this exploration, the goal is to validate the ocean current prediction data by showing that erratic behavior in ocean current observations is visually correlated with major meteorological events (large swell, strong winds, etc.), and that the ocean current predictions are a fair and valid representation of how we could expect the ocean current observations to behave in the absence of such events. The underlying motivation is to validate the use of the prediction data in supplying each meteorological observation with a predicted ocean current reading. This data will be used in developing a supervised machine learning model. By supplying the model with ‘smooth’, ‘clean’ data, as opposed to the ‘erratic’, ‘noisy’ ocean current observations, we aim to produce a model which generalizes well and predicts future events more accurately.
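
One way to supply each meteorological observation with a predicted current reading is a nearest-timestamp lookup. The Python sketch below illustrates that idea on made-up values. The nearest_prediction helper, the hourly prediction spacing, and the 30-minute tolerance are all assumptions for illustration, not the method used in this project.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def nearest_prediction(stamp, pred_times, pred_values, tolerance=timedelta(minutes=30)):
    """Return the prediction closest in time to `stamp`, or None if nothing
    lies within `tolerance`. `pred_times` must be sorted in ascending order."""
    i = bisect_left(pred_times, stamp)
    candidates = [j for j in (i - 1, i) if 0 <= j < len(pred_times)]
    if not candidates:
        return None
    best = min(candidates, key=lambda j: abs(pred_times[j] - stamp))
    if abs(pred_times[best] - stamp) > tolerance:
        return None
    return pred_values[best]


# Made-up hourly predictions (cm/s) and three meteorological timestamps.
pred_times = [datetime(2016, 2, 1, hour) for hour in (0, 1, 2)]
pred_values = [-40.0, 10.0, 35.0]
met_times = [
    datetime(2016, 2, 1, 0, 20),
    datetime(2016, 2, 1, 0, 50),
    datetime(2016, 2, 1, 5, 0),   # too far from any prediction
]

matched = [nearest_prediction(t, pred_times, pred_values) for t in met_times]
print(matched)
```

Returning None when no prediction is close enough avoids silently pairing an observation with a stale value.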

Recall from Part 1 these predictions and observations of ocean current for 2016:

There are four distinct patterns that I notice in this overlay of predictions and observations data. First, there are periods where the observations are skewed in the positive direction, mainly during the winter months. Second, there are periods where the predictions and observations appear to line up very well: May and June. Third, there is a period during the summer when the observations appear ‘dampened’: between July and September. Fourth, there are very erratic observations in the 300 cm/s range occurring in the last month of the year. We will explore all four of these trends.

But first, consider another representation of the same ocean current data for 2016:

In previous analysis we assumed that any current moving other than in a Westerly direction was a flooding current, and should therefore be treated as moving in a positive direction. Here we have highlighted an important feature, observed direction. It’s important to keep in mind that observed ocean currents don’t move in a strict, binary (ebb, flood) pattern as the predicted behavior suggests.

Also, recall from Part 1 that the source for the predictions data defined the mean predicted ebb direction as 290 deg true, and the mean predicted flood direction as 115 deg true. Comparing these values to the colors constructed from the observed dir field, we must keep in mind that these colors are constructed from a traditional definition of compass points in the N, E, S, and W directions with boundaries at 45 degrees between each compass point.
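
The four-point binning described above can be sketched in Python as follows. The four_point function is a hypothetical helper. Assigning a bearing that falls exactly on a boundary (45, 135, 225, or 315 degrees) to the next sector clockwise is an assumption, since the report does not say how ties are handled.

```python
def four_point(degrees):
    """Map a bearing in degrees true to N, E, S, or W using 90-degree sectors
    whose boundaries sit at 45, 135, 225, and 315 degrees."""
    points = ["N", "E", "S", "W"]
    index = int(((degrees % 360) + 45) // 90) % 4
    return points[index]


print(four_point(290))  # mean predicted ebb direction   -> W
print(four_point(115))  # mean predicted flood direction -> E
```

Under this binning the mean predicted ebb direction falls in the W sector and the mean predicted flood direction in the E sector, even though neither lies exactly on a compass point.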

Let’s continue with our exploration to compare meteorological data with this new representation of predicted and observed current patterns.

First, let’s focus on February 2016:

Notice the ‘swell events’, which are somewhat aligned with the erratic current observations. Swell events from a more Southerly direction seem to align with positive current readings moving in a more Northerly direction. Likewise, swell events from a more Westerly direction seem to align with positive current readings moving in a more Easterly direction.

Let’s zoom in on February 1st to 8th, 2016:

Let’s take a minute to highlight some important considerations:

Typical predicted peak current speeds rarely exceed the range from -100 to 50 cm/s, where 100 cm/s is approximately equal to 2.24 mph, or 1.94 knots.
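
For readers who want to reproduce the conversion, here is a small Python sketch. The helper names are hypothetical; the factors follow from the definitions of the statute mile (1,609.344 m) and the nautical mile (1,852 m).

```python
CM_PER_MILE = 160_934.4            # 1 statute mile = 1,609.344 m
CM_PER_NAUTICAL_MILE = 185_200.0   # 1 nautical mile = 1,852 m
SECONDS_PER_HOUR = 3600


def cm_s_to_mph(speed_cm_s):
    """Convert a speed in cm/s to miles per hour."""
    return speed_cm_s * SECONDS_PER_HOUR / CM_PER_MILE


def cm_s_to_knots(speed_cm_s):
    """Convert a speed in cm/s to knots (nautical miles per hour)."""
    return speed_cm_s * SECONDS_PER_HOUR / CM_PER_NAUTICAL_MILE


print(round(cm_s_to_mph(100), 2), round(cm_s_to_knots(100), 2))
```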

There is an important distinction regarding the ‘direction’ description of wind and waves as opposed to ocean current. Wind and wave directions are referenced as the direction that they are coming from, while ocean current directions are described as the direction to which they are moving.
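
When wind or wave directions are compared with current directions, it can help to put both on the same convention first. The Python sketch below flips a ‘coming from’ bearing into a ‘going to’ bearing; the function name is hypothetical.

```python
def coming_from_to_going_to(bearing_deg):
    """Convert a 'coming from' bearing (wind, waves) into the 'going to'
    bearing used for ocean currents, in degrees true."""
    return (bearing_deg + 180) % 360


# A swell coming from the Southwest (225 degrees) is travelling toward the Northeast.
print(coming_from_to_going_to(225))
```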

When comparing the timelines of these meteorological recordings and those of the ocean current observations it is important to remember that the swell recordings are aggregated readings from a “20-minute sample period” within the previous 30 minutes, while ocean current observations are observations of what is happening aggregated over a much shorter period of time. Review the NDBC Measurement Descriptions webpage for further details.

Another important consideration is the speed at which swell trains and individual waves travel, which affects the potential for energy transfer as a function of wave density. “The speed (in nautical miles per hour or kts) of an individual deep water wave is about 3 times it’s period (in seconds). That is, an individual wave with a 13 second period travels at 39 kts. Contrary to what you might intuitively think, there is a linear relationship between wave period and wave speed. But because most deep water waves move in groups, the group speed is half that of an individual wave (within the group), or in this example about 19.5 kts. As the wave moves into shallow water, the group speed and the individual wave speed become the same, so the individual wave starts traveling at the group speed, or 19.5 kts. This wave speed formula is approximate, and actually wave speeds are a fraction different,” as this website describes. I highly recommend spending a few minutes reading through that website; it covers a significant amount of pertinent information, such as wave density, which we will forgo for the purposes of this report.
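
The ‘3 times the period’ rule of thumb is consistent with the deep-water result from linear wave theory, c = g T / (2 pi), which works out to roughly 1.56 T in m/s, or about 3.03 T in knots. The Python sketch below applies that formula; the function names are hypothetical, and the relation only holds in deep water.

```python
import math

G = 9.80665                  # standard gravity, m/s^2
MS_TO_KNOTS = 3600 / 1852    # 1 knot = 1,852 m per hour


def phase_speed_knots(period_s):
    """Deep-water speed of an individual wave: c = g * T / (2 * pi)."""
    return G * period_s / (2 * math.pi) * MS_TO_KNOTS


def group_speed_knots(period_s):
    """Deep-water group speed, which is half the individual wave speed."""
    return phase_speed_knots(period_s) / 2


for period in (8, 13, 18):
    print(period, round(phase_speed_knots(period), 1), round(group_speed_knots(period), 1))
```

For a 13 second period this gives a little over 39 kts for the individual wave and a little under 20 kts for the group, in line with the quoted figures.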

If we walk through the timeline of ocean currents from the 1st to the 8th of February, 2016, we notice that the predicted range of currents starts out approximately between -75 and 25 cm/s and builds in magnitude through the 8th, where it ranges from -100 to 75 cm/s. Observations of ocean currents tend to remain in the positive, or flood, direction for the majority of the period, following the ebb predictions only during the strongest peak-ebb events. There are also three ‘spikes’ in the positive direction for observed ocean currents. These occur on February 3rd, 4th, and 5th/6th, corresponding to powerful groundswell events in the 10 to 15 ft range with a period of 14 to 18 seconds.

Moving on, let’s shift our attention to April 2016:

April is an interesting example, showing two main periods of discrepancies between ocean current predictions and observations. The first half of the month shows a strong series of predicted current exchanges with a maximum range of about -120 to 70 around April 10th. The second half of the month shows a smaller magnitude in predicted current exchanges, as well as a larger discrepancy in ocean current predictions and observations.

Let’s focus our attention on April 21st to 26th, 2016:

Notice the combination of Southerly winds and Southwest swell on April 23rd, and observed current readings mixed between North and East. Also, notice the combination of Northwest wind and Northwest swell on April 25th, and observed current readings moving in a Southerly direction.

Moving on, let’s explore May 2016:

There are two medium-sized swell events occurring around May 8th to 9th and 18th to 21st. The initial period of the month, with larger current exchanges and a more Northerly swell event, shows ocean current observations with a more Southerly direction. The second period of the month, with a slightly smaller current exchange and a more Southerly swell event, shows ocean current observations with a more Northerly direction. Notice also that the first swell event has a shorter period, and therefore less density, than the second swell event.

Moving forward, let’s look at July 2016:

What is happening here? Why are the ocean current observations so dampened?

Consider the aggregate mean water temperatures for all years compared with mean water temperatures for 2016:

Summary Statistics for Given Month, NDBC Station 46087, Temperatures in Degrees Celsius

Month  Mean ATMP    Mean WTMP    Mean ATMP  Mean WTMP
       (All Years)  (All Years)  (2016)     (2016)
1      6.83         8.341478     8.02       9.284761
2      6.63         8.172372     8.98       10.007626
3      7.47         8.747199     9.17       10.272293
4      8.72         9.655497     10.75      11.029079
5      10.54        10.677676    11.47      11.275913
6      11.73        11.387261    12.15      11.771208
7      12.53        11.861870    12.94      12.259020
8      12.68        11.948263    13.12      12.441842
9      12.30        11.732813    12.24      11.819357
10     10.91        11.307150    11.83      12.384589
11     8.99         10.606280    11.24      12.695649
12     6.94         9.093702     5.80       9.553108

And here is a visualization showing the differences in mean water temperatures for 2016 as compared to aggregate mean water temperature for all years:

As we can see, mean water temperatures for 2016 were above the all-years mean in every month, by roughly 0.1 °C in September and up to about 2.1 °C in November.
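
The month-by-month differences can be reproduced directly from the temperature table. The Python sketch below subtracts the all-years mean water temperature from the 2016 mean for each month; the variable names are illustrative only.

```python
# Mean WTMP in degrees Celsius (January to December), copied from the temperature table.
wtmp_all_years = [8.341478, 8.172372, 8.747199, 9.655497, 10.677676, 11.387261,
                  11.861870, 11.948263, 11.732813, 11.307150, 10.606280, 9.093702]
wtmp_2016 = [9.284761, 10.007626, 10.272293, 11.029079, 11.275913, 11.771208,
             12.259020, 12.441842, 11.819357, 12.384589, 12.695649, 9.553108]

# Difference between 2016 and the all-years mean, keyed by month number.
anomalies = {
    month: round(in_2016 - baseline, 2)
    for month, (baseline, in_2016) in enumerate(zip(wtmp_all_years, wtmp_2016), start=1)
}

for month, diff in anomalies.items():
    print(f"{month:>2}  {diff:+.2f}")

largest = max(anomalies, key=anomalies.get)
smallest = min(anomalies, key=anomalies.get)
print("largest difference in month", largest, "smallest in month", smallest)
```

Note that 2016 is itself part of the all-years mean, so these differences slightly understate how far 2016 sits from the other years.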

I believe the elevated water temperatures during the early part of 2016 led to an increase of marine growth for the summer months. In particular, this increase of marine growth may have fouled the Aanderaa Doppler Current Sensor (ADCS) instrumentation attached at a depth of 1.6 meters to the NDBC Station 46087, leading to the ‘dampened’ effect of ocean current observations. Even with an anti-fouling coating and routine maintenance, a permanent ocean fixture could easily become covered in marine growth during the peak of the season, exacerbated by higher than average water temperatures. NOAA describes their ‘Data Quality Control Checks and Procedures’ in this PDF document, with relevant ocean current quality control definitions beginning on page 25. Anyone who has left a boat in the ocean through the summer months understands how quickly this plant, barnacle and mussel growth can accumulate. For those not familiar with this phenomenon, a quick search engine query will provide examples.

Let’s examine the last month of the year, December 2016:

Let’s focus on the erratic behavior of the observed ocean current between the 17th and 19th of December:

I do not notice an apparent correlation between meteorological events and these extremely erratic observed current readings. However, I do notice that there is an increase in Northerly observed ocean current readings as the large swell event from the Southwest fills in and peaks on December 19th.

Summary, Conclusions, and Next Steps

We have explored aggregate meteorological and swell data from years 2004 to 2019, and confirmed the presence of anticipated trends along seasonal boundaries in wind speed and direction, as well as wave height, direction, and period. We have explored this data on a granular level, and examined correlations (not necessarily causations) between swell events and discrepancies between ocean current predictions and observations. We have explored the presence of unusually warm water temperatures during 2016, and made conjectures regarding ‘dampened’ ocean current observation recordings. And finally, we have explored unusually erratic ocean current observations and did not find any correlation with meteorological events.

In examining these occurrences of deviation between predicted and observed ocean currents, I have shown correlation with strong swell events; I have shown adherence to predicted flows during strong current exchanges; I have conjectured regarding correlation with potential algal growth interference; and ultimately, I have shown that the ocean current prediction data is a valid baseline for how we could expect ocean currents to behave in the absence of interfering factors (noise: meteorological events, marine growth, data transmission errors, etc.). In conclusion, I feel validated in moving forward with utilizing the ocean current predictions data in a supervised machine learning model.

In Part 3 of this project we will explore data from the NDBC Station 46088, also known as the New Dungeness Buoy, located at the East entrance to the Strait of Juan de Fuca. Our intent will be to explore meteorological data from this station to determine a list of dates on which swell was recorded travelling through the strait. We will use this list of dates to explore ocean current and meteorological data at the Neah Bay Buoy, NDBC Station 46087, and look for trends in conditions. This will lead us to Part 4 of the Exploring Ocean Buoy Data project, where we will explore features through statistical analysis to help us train a supervised machine learning model to predict the class of wave occurring at the New Dungeness Buoy, given forecasted conditions at the Neah Bay Buoy.

Appendix

Data Definitions

Summary of Meteorological Data:

##      id           Date_Time                        WVHT            MWD       
##  46087:222823   Min.   :2004-07-09 00:00:00   Min.   :0.27    Min.   :  1.0  
##                 1st Qu.:2008-04-18 12:05:00   1st Qu.:1.19    1st Qu.:253.0  
##                 Median :2013-03-13 06:20:00   Median :1.67    Median :270.0  
##                 Mean   :2012-09-14 14:50:12   Mean   :1.85    Mean   :263.4  
##                 3rd Qu.:2016-10-16 08:05:00   3rd Qu.:2.32    3rd Qu.:281.0  
##                 Max.   :2019-12-31 23:20:00   Max.   :9.93    Max.   :360.0  
##                                               NA's   :74158   NA's   :73144  
##       dir              swell_type         DPD             APD       
##  W      :68220   groundswell:21752   Min.   : 2.94   Min.   : 3.12  
##  WNW    :31651   windswell  :70805   1st Qu.: 8.33   1st Qu.: 6.27  
##  WSW    :28081   windwave   :55872   Median :10.81   Median : 7.13  
##  SW     :13090   chop       :  236   Mean   :10.54   Mean   : 7.24  
##  SSW    : 3344   flat       :    0   3rd Qu.:12.12   3rd Qu.: 8.14  
##  (Other): 5293   NA's       :74158   Max.   :23.53   Max.   :14.27  
##  NA's   :73144                       NA's   :74158   NA's   :74158  
##       WDIR           w_dir            WSPD             GST        
##  Min.   :  1.0   E      :35703   Min.   : 0.000   Min.   : 0.000  
##  1st Qu.: 99.0   ESE    :33459   1st Qu.: 2.600   1st Qu.: 3.400  
##  Median :173.0   W      :19962   Median : 4.400   Median : 5.600  
##  Mean   :175.9   WSW    :17478   Mean   : 5.089   Mean   : 6.467  
##  3rd Qu.:254.0   WNW    :16939   3rd Qu.: 7.200   3rd Qu.: 8.900  
##  Max.   :360.0   (Other):96688   Max.   :23.600   Max.   :31.000  
##  NA's   :2594    NA's   : 2594   NA's   :588      NA's   :1104    
##       PRES           ATMP             WTMP            DEWP        
##  Min.   :   0   Min.   :-3.900   Min.   : 3.60   Min.   :-14.000  
##  1st Qu.:1013   1st Qu.: 7.800   1st Qu.: 9.00   1st Qu.:  5.100  
##  Median :1017   Median :10.100   Median :10.50   Median :  8.400  
##  Mean   :1015   Mean   : 9.774   Mean   :10.36   Mean   :  7.753  
##  3rd Qu.:1021   3rd Qu.:12.000   3rd Qu.:11.60   3rd Qu.: 11.000  
##  Max.   :1045   Max.   :22.000   Max.   :21.40   Max.   : 18.300  
##  NA's   :468    NA's   :738      NA's   :1302    NA's   :28334

Glimpse of Meteorological Data:

## Rows: 222,823
## Columns: 16
## $ id         <fct> 46087, 46087, 46087, 46087, 46087, 46087, 46087, 46087, ...
## $ Date_Time  <dttm> 2004-07-09 00:00:00, 2004-07-09 01:00:00, 2004-07-09 02...
## $ WVHT       <dbl> 0.69, 1.57, 1.43, 1.52, 1.31, 1.45, 1.35, 1.27, 1.28, 1....
## $ MWD        <int> 278, 278, 252, 288, 283, 278, 276, 264, 280, 280, 276, 2...
## $ dir        <fct> W, W, WSW, WNW, WNW, W, W, W, W, W, W, W, WNW, WNW, W, W...
## $ swell_type <fct> windwave, windwave, groundswell, windwave, windwave, win...
## $ DPD        <dbl> 5.56, 8.33, 13.79, 8.33, 8.33, 8.33, 8.33, 13.79, 8.33, ...
## $ APD        <dbl> 4.15, 6.42, 6.40, 6.47, 6.42, 6.88, 7.13, 6.76, 7.19, 7....
## $ WDIR       <int> 268, 269, 295, 261, 286, 278, 283, 292, 277, 255, 233, 2...
## $ w_dir      <fct> W, W, WNW, W, WNW, W, WNW, WNW, W, WSW, SW, WSW, WNW, E,...
## $ WSPD       <dbl> 2.4, 2.2, 1.7, 4.9, 4.2, 4.0, 4.9, 4.6, 3.7, 2.0, 2.9, 1...
## $ GST        <dbl> 3.0, 2.9, 2.2, 5.7, 4.9, 4.7, 5.9, 5.4, 4.7, 2.5, 4.0, 2...
## $ PRES       <dbl> 1015.8, 1015.0, 1014.4, 1014.1, 1013.9, 1013.8, 1013.6, ...
## $ ATMP       <dbl> 12.0, 11.6, 11.6, 11.9, 11.9, 11.9, 12.0, 12.0, 11.8, 11...
## $ WTMP       <dbl> 11.4, 11.3, 11.8, 11.8, 11.5, 11.6, 11.6, 11.6, 11.5, 11...
## $ DEWP       <dbl> 11.4, 11.1, 11.1, 11.3, 11.3, 11.3, 10.9, 10.8, 10.9, 10...

Here we will walk through a definition and short description for each field:

  • id indicates the location. NDBC Station ID 46087 refers to the Neah Bay Buoy.
  • Date_Time is the year, month, day, and time of the recorded observation. Observations are recorded twice hourly then stored in GMT/UTC timezone by the NDBC.
  • WVHT is defined by the NDBC website as, “Significant wave height (meters) is calculated as the average of the highest one-third of all of the wave heights during the 20-minute sampling period.”
  • MWD is defined by the NDBC website as, “The direction from which the waves at the dominant period (DPD) are coming. The units are degrees from true North, increasing clockwise, with North as 0 (zero) degrees and East as 90 degrees.”
  • dir is a feature I engineered using the data from MWD. Values follow the standard notation for cardinal direction; more information on cardinal direction can be found here.
  • swell_type is a feature I engineered using DPD. Values classify each observation as ‘groundswell’ (dominant wave period greater than or equal to 13 seconds), ‘windswell’ (less than 13 seconds but greater than or equal to 10 seconds), ‘windwave’ (less than 10 seconds but greater than 4 seconds), ‘chop’ (4 seconds or less), or ‘flat’ (dominant period equal to 0 with a wave height of 0). For more information on swell mechanics see this website.
  • DPD is defined by the NDBC website as, “Dominant wave period (seconds) is the period with the maximum wave energy.”
  • APD is defined by the NDBC website as, “Average wave period (seconds) of all waves during the 20-minute period.”
  • WDIR is defined by the NDBC website as, “Wind direction (the direction the wind is coming from in degrees clockwise from true N) during the same period used for WSPD.”
  • w_dir is a feature I engineered using the data from WDIR and the same value definitions as dir.
  • WSPD is defined by the NDBC website as, “Wind speed (m/s) averaged over an eight-minute period for buoys.”
  • GST is defined by the NDBC website as, “Peak 5 or 8 second gust speed (m/s) measured during the eight-minute or two-minute period. The 5 or 8 second period can be determined by payload.”
  • PRES is defined by the NDBC website as, “Sea level pressure (hPa).”
  • ATMP is defined by the NDBC website as, “Air temperature (Celsius).”
  • WTMP is defined by the NDBC website as, “Sea surface temperature (Celsius). For buoys the depth is referenced to the hull’s waterline.”
  • DEWP is defined by the NDBC website as, “Dewpoint temperature taken at the same height as the air temperature measurement.”
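
The engineered fields dir, w_dir, and swell_type can be sketched in Python as follows. This is a minimal reconstruction from the definitions above rather than the original feature-engineering code. The function names are hypothetical, and the 16-point binning with 22.5 degree sectors centered on each compass point is an assumption, although it does reproduce the dir and w_dir values shown in the glimpse output.

```python
POINTS_16 = ["N", "NNE", "NE", "ENE", "E", "ESE", "SE", "SSE",
             "S", "SSW", "SW", "WSW", "W", "WNW", "NW", "NNW"]


def compass_16(degrees):
    """Map a bearing in degrees true to a 16-point compass label,
    using 22.5-degree sectors centered on each point (assumed binning)."""
    return POINTS_16[int((degrees % 360) / 22.5 + 0.5) % 16]


def swell_type(dpd, wvht):
    """Classify an observation from its dominant period (s) and wave height (m)."""
    if dpd == 0 and wvht == 0:
        return "flat"
    if dpd >= 13:
        return "groundswell"
    if dpd >= 10:
        return "windswell"
    if dpd > 4:
        return "windwave"
    return "chop"


# First three rows of the glimpse output: (MWD, DPD, WVHT).
for mwd, dpd, wvht in [(278, 5.56, 0.69), (278, 8.33, 1.57), (252, 13.79, 1.43)]:
    print(compass_16(mwd), swell_type(dpd, wvht))
```

Checking ‘flat’ first keeps a zero period from falling through to ‘chop’.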

Further details regarding measurement techniques utilized by the NDBC can be found here.